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We calculate the likelihood map in the full 7 dimensional parameter space of the minimal super- 
symmetric standard model (MSSM) assuming universal boundary conditions on the supersymmetry 
breaking terms. Simultaneous variations of mo, Ao, M1/2, tan/3, m t , mt and a s (Mz) are applied 
using a Markov chain Monte Carlo algorithm. We use measurements of b — > S7, (g — 2) M and 
QdmIi 2 in order to constrain the model. We present likelihood distributions for some of the spar- 
ticle masses, for the branching ratio of B® — > fi + fi~ and for mf — rn x o. An upper limit of 2xlCP 8 
on this branching ratio might be achieved at the Tevatron, and would rule out 29% of the currently 
allowed likelihood. If one allows for non thermal-neutralino components of dark matter, this fraction 
becomes 35%. The mass ordering allows the important cascade decay — > X2 — > Ir — > Xl with a 
likelihood of 24±4%. The stop coannihilation region is highly disfavoured, whereas the light Higgs 
region is marginally disfavoured. 
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INTRODUCTION 



Weak-scale supersymmetry provides a well- 
documented solution to the technical hierarchy 
problem which is particularly difficult to solve 

in a perturbatively calculable model. Specialising to a 
minimal extension of the Standard Model, the MSSM, 
one can provide a weakly interacting massive particle 
dark matter candidate, provided R— parity is respected 
by the model. Examples of dark matter candidates 
arc the gravitino 0, the axino [i| and the lightest 
neutralino B, the subject of much recent investiga- 
tion @, 0, [E M, M, EH- Th e general MSSM is rather 
complicated due to the large number of free parameters 
in the supersymmetric (SUSY) breaking sector. How- 
ever, the observed rareness of flavour changing neutral 
currents (FCNCs) suggests that the vast majority of 
parameter space for general SUSY breaking terms is 
ruled out. Particular patterns of SUSY breaking param- 
eters can postdict small enough FCNCs: for instance 
flavour universality. One highly studied subset of such 
terms is that of mSUGRA, often called the Constrained 
Minimal Supersymmetric Standard Model (CMSSM). 
In mSUGRA, at some high energy scale (typically 
taken to be the scale of unification of electroweak gauge 
couplings), all of the SUSY breaking scalar mass terms 
are assumed to be equal to mo, the scalar trilincar 
terms are set to Aq and the gaugino masses are set 
equal (M 1 / 2 ). These are indeed strong assumptions, 
but they have several advantages for phenomenological 
analysis as the number of independent SUSY breaking 
parameters is much reduced. Indeed, assuming that 
the MSSM is the correct model, the initial data from 
the Tevatron or Large Hadron Collider are likely to 
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contain only a few relevant observables and so one may 
be able to fit against a simple SUSY breaking model 
as an example [12]. As the data become more accurate 
and additional relevant observables are measured, the 
lack of a good fit would propel extensions of the simple 
model. One may then start to consider patterns of 
non-universality, for instance. For the rest of this paper 
though, given the lack of data to the contrary, we will 
assume mSUGRA. Aside from the universal soft terms 
mo, Ao, Mi/2, other non-standard model mSUGRA 
input parameters are taken to be tan/3, the ratio of the 
two Higgs vacuum expectation values, and the sign of p, 
(a parameter that appears in the Higgs potential of the 
MSSM). 

When combined with large scale structure data, the 
Wilkinson microwave anisotropy probe (WMAP) [H, [3 
has placed stringent constraints upon the dark matter 
relic density floAih 2 . A common assumption, which we 
will adhere to here, is that the neutralino makes up the 
entire cold dark matter relic density. The prediction of 
the relic density of dark matter in the MSSM depends 
crucially upon annihilation cross-sections, since in the 
early universe SUSY particles will annihilate in the ther- 
mal bath. Regions of mSUGRA that are compatible with 
the WMAP constraint often predict some of the following 
annihilation channels [l5l |: 

• Stau (f ) co-annihilation [l6| at small mo where the 
lightest stau is quasi-degenerate with the lightest 
neutralino (xi)- 

• Pseudoscalar Higgs (^4°) funnel region at large 
tan/3 > 45 where two neutralinos annihilate 
through an s-channel A resonance [l?], [H| ■ 

• Light CP-even Higgs (h°) region at low Mi/ 2 where 
two neutralinos annihilate through an s-channel h 
resonance [l7l [l9j. 

• Focus point 20, 2l|, [22| at large m where a sig- 
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nificant Higgsino component leads to efficient neu- 
tralino annihilation into gauge boson pairs. 

• Stop co-annihilation [23|, [24], [25[ at large (—A ), 
where the lightest stop is close in mass to the light- 
est neutralino. 

Many pre-WMAP analyses focused on the so-called bulk 
region [17]. The bulk region is continuously connected 
to the stau co-annihilation region at low mo and M 1 / 2 - 
There are two reasons why the bulk region has shrunk in 
size when one takes the current constraints into account: 
the WMAP constraint upon the relic density has ruled 
much of the region out, and the new low value of the 
top mass mean that the MSSM Higgs mass predictions 
are sometimes too low for low mo and Mjm and are ruled 
out by LEP2 constraints. The (now reduced) bulk region 
will make an implicit appearance in our results, and we 
will comment upon this fact later. 

Several authors @, [H, [13, H [U, 13 EI have asked 
how the annihilation cross-section can be constrained by 
collider measurements in order to provide a more solid 
prediction of the relic density. This would then be fed 
into a cosmological model in order to predict ^InAfh 2 
for comparison with the value derived from cosmological 
observation, allowing a test of cosmological assumptions 
(and the assumption that there is only one component of 
cold dark matter). Of course, colliders could not unam- 
biguously identify the lightest observed SUSY particle as 
the dark matter since it could always decay unobserved 
outside the detector. It would therefore be interesting 
to combine collider information with that derived from 
a possible future direct detection [32j of dark matter, 
providing corroboration and additional empirical infor- 
mation. Before such observations are made, however, we 
may ask how well current data constrain models of new 
physics. 

This question has been addressed many times for 
mSUGRA by using the dark matter constraint. Most of 
the analyses (see, for example [|, 0, [H H HI, S S3, HI] ) 
fix all but two parameters and examine constraints upon 
the remaining 2 dimensional (2d) slice of parameter 
space. The dark matter relic density constraint is the 
most limiting, but the branching ratio of the decay 
b — > S7 and the anomalous magnetic moment of the muon 
(g — 2) M also rule part of the parameter space out. Re- 
cent upper bounds from the Tevatron experiments on the 
branching ratio B s — ► /i+ [i~ [4(j have the potential to re- 
strict mSUGRA in the future, but the analysis of ref. [4ll ] 
shows that the resulting constraints currently subsumed 
within other constraints. In the above analyses, limits 
are typically imposed separately, each to some prescribed 
confidence level. Such analyses have the advantage of be- 
ing quite transparent: it is fairly easy to see which con- 
straint rules out which part of parameter space. However, 
they have the disadvantages of not properly describing 
the combination of likelihoods coming from different ex- 
perimental constraints and of having to assume ad hoc 
values for several input parameters. In particular, as well 



as the soft SUSY breaking input parameters, the bottom 
mass ruf,, the strong structure constant as(Mz) and the 
top mass m t can all have a strong effect on mSUGRA pre- 
dictions. A large random scan of flavour diagonal MSSM 
space involving 10 points that pass various prescribed 
constraints was presented in ref.[42j, however the sam- 
pling of the 20d parameter space was necessarily sparse. 
The analysis is also subject to the limitation that like- 
lihoods have not been combined; instead the measure- 
ments have been used as cuts to discard points. In ref. 0], 
the likelihood from the observables is calculated, prop- 
erly combining different constraints, but again Id and 2d 
slices through parameter space were taken. Of course the 
time taken to efficiently sample from a likelihood distri- 
bution using the naive method (a scan) scales like a power 
law with respect to the number of parameters, meaning 
that in practice even a high resolution 3d scan is diffi- 
cult. By parameterising lines in 2d that are consistent 
with the WMAP dark matter constraint and scanning in 
two other parameters, the analysis of ref. [ll| calculates 
the x 2 statistic for the 2d part of a 3d parameter space 
which is consistent with the WMAP constraint on the 
dark matter relic density. The predicted value of fl^Mh 2 
is not combined in the \ 2 with the other observables for 
this analysis, and the parameter tan (3 must be fixed. As 
the authors note [ll|, parts of the scan were sparse. In 
Ref. [HI , a scan was performed which included variations 
of Ao and tan j3 as well as other mSUGRA parameters. It 
is clear from this paper that the WMAP allowed region 
(expressed in the M 1 / 2 -rno plane) becomes much larger 
from the Aq variations. No likelihood distribution was 
given. 

Baltz and Gondolo (44| demonstrated that a Markov 
chain Monte Carlo (MCMC) algorithm efficiently sam- 
ples from the mSUGRA parameter space, rendering 4d 
scans in mo, Aq, M1/2, tan/3 feasible. However, they were 
interested in which parts of parameter space are com- 
patible with the WMAP measurement of fl^Af/i 2 and 
what the prospects are for direct detection there, not in 
the likelihood distribution. In order to increase the ef- 
ficiency of their parameter sampling, they changed the 
simple "Metropolis-Hastings" MCMC algorithm in order 
to achieve a better efficiency. As the authors state in 
their conclusions, this has the consequence that caution 
must be exercised when trying to interpret their results 
as a likelihood distribution. Indeed, we will show in a 
toy model that changes to the MCMC algorithm like the 
ones that Baltz and Gondolo made can alter the sampling 
from a distribution. 

It is our purpose here to utilise the MCMC algorithm 
in such a way as to reliably calculate the combined like- 
lihood of mSUGRA in the full dimensionality of its pa- 
rameter space, thereby extending the previous studies. 
We will then be able to infer what is known about the 
multi-dimensional parameter space, including important 
variations of the SM quantities. These results will have 
implications for collider searches and rare decays. 

In section [Til we briefly review the MCMC algorithm. 
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We present the implementation used in the present paper 
to calculate the likelihood maps of mSUGRA parameter 
space and then demonstrate that the results are conver- 
gent using a particular statistical test. In section HTT1 we 
present the likelihood distributions and derived quanti- 
ties of the 7d mSUGRA parameter space. In section HVl 
we illustrate the effects of theoretical uncertainties in 
the sparticle spectrum calculation and in section [VI pos- 
sible effects from allowing an additional non thermal- 
neutralino component to the relic density are explored. A 
summary and conclusions are presented in section fVTl In 
appendix [AJ we demonstrate with two different toy mod- 
els that the algorithm used by Baltz and Gondolo may 
not provide a sampling proportional to the likelihood of 
the parameter space. 

II. IMPLEMENTATION OF THE MCMC 
ALGORITHM 

A. Likelihoods 

Some readers might be unfamiliar with the use of 
statistics in this paper, and so we include some com- 
ments on how to interpret them. The likelihood is not 
dependent upon any priors. The likelihood £ = p{d\m) 
is the probability density function (pdf) of reproducing 
data d assuming some mSUGRA model m. In p(d\m), 
the model m is specified by the mSUGRA input parame- 
ters and so p(d\m) has a dependence upon them. p(d\m) 
is related to the pdf of the model being the one chosen 
by nature, given the data, by an application of Bayes' 
theorem: 

p( m \d)=p(d\m)^-, (1) 
p(d) 

where p{m) is the probability of the model being cor- 
rect (the prior) and p(d) is the total probability of the 
data being reproduced, integrating over all possible mod- 
els. p(d) is practically impossible to estimate, so we can- 
not get the quantity that one really wishes to estimate, 
p(m\d). However, we may compare the relative probabil- 
ities of two different models mi and m,2 (corresponding 
here to different points of mSUGRA space) by applying 
Eq. [1] for each model, implying that 

P(rrn\d) = p{d\m l )p(m l ) 
p(m 2 \d) p{d\m 2 )p(m 2 )' 

We note here the appearance of the infamous prior dis- 
tributions p(mi),p(m2)- If one assumes that the ratio 
of these two priors is one (that no region of parameter 
space is more likely than any other), one may interpret 
the likelihood ratio of two different points in mSUGRA 
space as the ratio of probabilities of the models, given 
the data. In this paper however, we provide likelihood 
distributions. If the reader prefers a different ratio of pri- 
ors to one, they must convolute the likelihood density we 
give with their preferred ratio of pdfs. 



B. The MCMC Algorithm 

We now briefly review the Metropolis MCMC al- 
gorithm, but for a more thorough explanation, see 
refs. 0, 5f| . Other adaptive scanning algorithms have 
recently been suggested in the context of high energy 
physics [4(il [47| ] but (although they can be very useful 
for other purposes) they do not yield a likelihood distri- 
bution. A Markov chain consists of a list of parameter 
points (x^*)) and associated likelihoods (C^ = C(x^')). 
Here t labels the link number in the chain. Given 
some point at the end of the Markov chain (x^), the 
Metropolis-Hastings algorithm involves randomly picking 
another potential point (x^* +1 ^) (typically in the vicinity 
of x™) using some proposal pdf Q(x; x^). In this paper 
we will specialise to the case of symmetric proposal func- 
tions, i.e. Q(x 6 ;x Q ) = Q(x a ;x b ). If £(* +1 ) > £«, the 
new point is appended onto the chain. Otherwise, the 
proposed point is accepted with probability £(* +1 ) /£*■*) 
and, if accepted, added to the end of the chain. If the 
point x( t+1 ) is not accepted, the point x(*) is copied on 
to the end of the chain instead. 

Providing "detailed balance" is satisfied, it can be 
shown [45J that the sampling density of points in the 
chain is proportional to the target distribution (in this 
case, the likelihood) as the number of links goes to in- 
finity. In the context of this analysis, detailed balance 
states that for any two points x Q , Xb 

T , (x ;X6)£(x 6 ) = T(x 6 ;x a )£(x ), (3) 

where T(xf,;x a ) = Q(xf,;x a ) x min(l,£(xi,)/£(x )) is 
the probability of making a transition from x a to x& in 
the case where the proposal function is symmetric. Thus, 
the probability of sampling a point x a from the likelihood 
distribution and then making a transition to x& be equal 
to the probability of sampling Xb and making a transition 
to x . 

The Metropolis-Hastings MCMC algorithm is typically 
much more efficient than a straightforward scan for D > 
3; the number of required steps scales roughly linearly 
with D rather than as a power law. The sampling is in 
principle independent of the form of Q as t — > oo as long 
as it is bigger than zero everywhere. However, Q must 
be chosen with some care: since in practice we can only 
sample a finite number of points, the choice of the form 
of Q can determine whether the entire parameter space 
is sampled and how quickly convergence is reached. 

Baltz and Gondolo used a geometrical model for Q: 
choosing a random distance from the point and us- 
ing a direction that was calculated from the positions of 
previous points in the chain. The width of the random ra- 
dius pdf was calculated depending upon previous points 
in the chain in order to increase the efficiency of the calcu- 
lation, aiming to accept roughly 25% of potential points. 
Either of these changes upset detailed balance and may 
spoil the sampling. We demonstrate in Appendix [Al with 
toy models that the width changing modification gives a 
sampling that is not proportional to the target density. 
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TABLE I: Parameter ranges considered. 



parameter 


range 


sign(/x) 


+1 


A 


-2 TeV-2 TeV 


m 


60 GeV-2 TeV 


M 1/2 


60 GeV-2 TeV 


tan (3 


2-60 



TABLE II: Lower bounds applied to sparticle mass predic- 
tions in GeV. 



mo 


37 


m ± 


67.7 


m- g 


195 




76 




88 




86.4 


m ~ bl 


91 




250 




43.1 















C. Parameter Ranges 

On general naturalness grounds [8(j, we expect too and 
Mi/2 to not be too large: less than say, 2 TeV. mSUGRA 
is ruled out by negative results in sparticle searches for 
mo < 60 GeV or M l/2 < 60 GeV. \x > is favoured 
by the measurement of the anomalous magnetic moment 
of the muon. tan (3 is bounded from below by negative 
searches at LEP2 for h° (and perturbativity of the top 
Yukawa coupling) and from above by perturbativity of 
the Yukawa couplings up to the unification scale. 

Upper bounds upon Too can exclude the focus point re- 
gion, which, in our calculation, is at much higher values 
of mo ~ 0(8) TeV. It has been argued that a quantitative 
measure of fine-tuning in the focus point region is not too 
large [13, HH , however the fine-tuning of the top quark 
Yukawa coupling is enormous [48| . This makes the focus- 
point regime practically impossible to reliably calculate 
starting from mSUGRA inputs. Tiny higher order effects 
in the top Yukawa coupling strongly change the position 
of the focus point regime in mSUGRA parameter space. 
In ref . 48] , it is demonstrated that the focus point regime 
moves in the m direction by several TeV depending on 
how exactly the highest order top-quark Yukawa radia- 
tive corrections are calculated. Because the calculation 
cannot be controlled with the current state-of-the-art cal- 
culations of the top Yukawa coupling, we will exclude the 
focus point regime from this analysis by placing an ap- 
propriate upper bound upon mo. Here, we restrict the 
parameter space to that shown in Table [U 



D. Observables and constraints 

We calculate the MSSM spectrum from mSUGRA pa- 
rameters, by using the program S0FTSUSY1 . 9 . 2 [491 ] . 
Ideally we would like to include data from negative 
search results from collider data within a combined like- 
lihood. Unfortunately it is difficult to obtain the data 
in such a form and so instead, we assign a zero like- 



lihood to any point for which at least one of the con- 
straints [5(| in Table [TT| is not satisfied. We also imple- 
ment a parameterisation [8l] of the 95% confidence level 
limits [52] on m h (g hZ z / g^zz), where ghzzlgfMz is tlic 
ratio of the MSSM higgs coupling to two Z° bosons to 
the equivalent Standard Model coupling. In order to take 
a 3 GeV uncertainty on the mSUGRA prediction of mn 
into account, we add 3 GeV [U [H| to the to^o value 
that is used in the parameterisation. In the MSSM, 
9hzz / 9hzz ~ sm (P ~ a ) and m practice, it is easier 
to apply limits in terms of the inverse parameterisation 
sin 2 (P — a) (to^o ) as shown in Table IIIIl 

The spectrum is transferred via the SUSY Les 



Houches Accord [5 
micrOMEGAsl.3.5 \fT 




to the computer program 
56j in order to calculate several 
quantities used to calculate the likelihood of a parameter 
point. We will use six measurements in order to construct 
the final likelihood of any given point of parameter space. 
As mentioned in the introduction, we make the assump- 
tion that the neutralino makes up the entire cold dark 
matter relic density as constrained by WMAP: 



fl DM h 2 = 0.1126; 



-0.0081 
-0.0091- 



(4) 



The anomalous magnetic moment of the muon has been 
measure d 1571 t o be higher than the Standard Model pre- 
diction 

MM- 

The experimental measurement is so 
precise that the comparison is limited by theoretical un- 
certainties in the Standard Model prediction. Following 
Ref. (60j . we constrain any new physics contribution to 
be 



■ {g 



2) M 



= 19.0 ±8.4 x 10" 



(5) 



Adding theoretical errors [6l| to measurement errors [6j 
in quadrature for the branching ratio for the decay b — 
sj, one obtains the empirically derived constraint 



BR(b -> s 7 ) = 3.52 ± 0.42 x 10" 



(G) 



The Standard Model inputs' measurements also con- 
tribute to the likelihood. We take these to be for 
the running bottom quark mass in the modified minimal 
subtraction scheme, 

m b (m b j M? = 4.2 ±0.2 GeV, (7) 

for the pole mass of the top quark [83] [63|, 

m t = 172.7 ±2.9 GeV, (8) 

and for the strong coupling constant in the modified min- 
imal subtraction scheme at Mz 

OLsiMz) 1 ^ = 0.1187 ±0.002. (9) 
A prediction pi of one of these quantities, where 

2)m/2, 



{a s {M z ) MS , m u m b {m b ) MS , {g - 
BR(6 — > s"f),fl DM h 2 } 



(10) 
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TABLE III: Parameterisation of 95% condidence level LEP2 Higgs limits on the m h o-sin 2 (/3— a) plane. All points with m h o < 90 
GeV are ruled out. 



m h o /GeV range 


upper bound on sin 2 (/3 — a) 


90-99 
99-104 
104-109.5 
109.5-114.4 


-6.1979 + 0.12313 m h o/GeV - 0.00058411 (m^/GeV) 2 
35.73 - 0.69747 m h o /GeV + 0.0034266 (m h o/GeV) 2 
21.379 - 0.403 m h o/GeV + 0.0019211 {m h o/GeV) 2 
1/(60.081 - 0.51624 m h0 /GeV) 



with measurement rrii ± Si yields a log likelihood 

In A = ~ K ~f ^ ~ ~ M2tt) - hxa u (11) 

assuming the usual Gaussian errors. Note that since 
Eq. g]has asymmetric errors, SQ DMh 2 = 0.0081(0.0091) 
if our prediction is higher (lower) than the observed cen- 
tral value. To form the combined likelihood, one takes 
ln£*°* = 2Z i=1 InjCi, corresponding to the combination 
of independent Gaussian likelihoods. In practice, we will 
ignore the normalisation constants — h ln(27r)— Ins,, since 
the likelihood distribution has an arbitrary normalisation 
anyway. 

We take the proposal function to be the product 
of Gaussian distributions along each dimension k — 
1, 2, . . . , D centred on the location of the current point 
along that dimension, i.e. x k : 

Q( X ( t+1 );x (t) ) = ft -^Ue-^-^fM (12 ) 
k=i v2^fc 

where Ik denotes the width of the distribution along di- 
rection k. By trial and error we find that using values 
of Ik that are equal to the parameter range of dimension 
k given in Table Q] divided by 25 works well. For the 
Standard Model inputs, we choose Ik = 8t7fc/25. 

In order to start the chain we follow the following pro- 
cedure, which finds a point at random in parameter space 
that is not a terrible fit to the data. We pick some y(°) 
at random in the mSUGRA parameter space using a flat 
distribution for its pdf. The Markov chain for y is evolved 
a sufficient number of steps (t) such that ln£(yO) > —5, 
i.e. the initial chain has found a reasonable fit. We then 
set x(°) = y W , continuing the Markov chain in x and dis- 
carding the "burn-in" chain y. The reasonable-fit point is 
typically found long before 2000 iterations of the Markov 
chain. 



E. Convergence 

In order that likelihood distributions calculated in this 
paper be considered reliable, it is important to check con- 
vergence of the MCMC. This is done by running 9 inde- 
pendent Markov chains, each with random starting po- 
sitions as described above. The starting positions are 
chosen in the ranges presented in Table U with a flat 



pdf, since it is important for the convergence measure 
that the initial values be over distributed compared to 
the likelihood function one samples from. By examin- 
ing the variance and means of input parameters within 
the chains and between the 9 different chains, we will 
construct a quantity [64| R- R will provide an upper 
bound on the factor of expected decrease of variance of 
Id likelihood distributions if the chain were iterated to an 
infinite number of steps. An R value can be constructed 
for scalar quantities that are associated with a point x^. 

The analysis of the R convergence statistic follows 
Ref. [lij closely. We consider c = 1, . . . , M chains 
(M = 9 here), each with N = 10 6 steps. Then we may 
define the average input parameter along direction k for 
the chain c and the average amongst the ensemble of 
chains 

1 N 1 M 

t—l c=l 

respectively. The variance of chain c along direction k is 
1 N 

rac = ¥3I ^([4 i3 ]c-[S fc ]c) 2 , (14) 
t=i 

so that we have the average of the variances within a 
chain 

1 M 

wk = m ( 15 ) 

c—l 

and the variance between chains' averages 
1 M 

c=l 

The basic ratio constructed corresponds to 
Rk= M^ Wk+Bk/N{1 + ^_ 

Wk 

As long as the initial seed parameters of the Markov chain 
are over-distributed, i.e. they have larger variance than 
the likelihood, this ratio will be larger than one [64| if 
the chains have not converged or if they have not had 
time to explore the entirety of the parameter space. It 
tends to one only if both of these conditions are met. 
In order to construct i?, we must take into account the 
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sampling variability of [xk) c and [T4] c . The variance of 
chain variances along direction k is estimated to be 



t'k 



M 



1 M 



(18) 



and we must take into account the following estimates of 
co- variances between the values of [xk]c and [Vk] c - 

1 M 

(CTfc)l = -WkX 2 k + JJ^2[Vk]c[xk]l, 



c=l 



i.CFk)l 



~WkX k 



(19) 



Defining the total estimated variance of the target distri- 
bution along direction k 

2 <JV ~SPN + (Ml ~ 2i ^' ) ' (20) 
where we have degrees of freedom 



dfk = 2 



(R k w k + B k /(NM)) 2 



(21) 



leading us to the final equation for the estimated reduc- 
tion in the sampled variance as t — > oo: 



Rk = Rk 



dfk 



df k -2' 



(22) 



Here, we define r =m&Xk{vRk}- Values of r < 1.05 
are considered to signify convergence and compatibility of 
the chains, since we could only hope to decrease the scale 
of any of the input parameter distributions by at most 
5% by performing further Markov Chain steps. In Fig.[TJ 
we show the quantity rasa function of step number, r < 
1.05 is met already for 600 000 steps indicating adequate 
convergence, although for the results we present below 
we we always use the full 9x1 000 000 sample. 



III. LIKELIHOOD MAPS 

The number of input parameters exceeds the number of 
data and the likelihood shows a rough degeneracy along 
directions which give iso-lines of floMh 2 . The parame- 
ters of the best-fit point of the MCMC do not therefore 
supply us with much information, but the value of the 
likelihood at that point is interesting: a very small value 
would indicate a high % 2 and therefore a bad fit. The 
best-fit point sampled by the MCMC with 7d input pa- 
rameter space was 

m = 964 GeV, M 1/2 = 341 GeV, A = 1394 GeV, 

m b {m b ) = 4.18 GeV, m t = 173.0 GeV, 

a s {M z ) = 0.1185, tan (3 = 57.9, (23) 




10 20 30 40 50 60 70 80 90 100 
step/ 10000 

FIG. 1: Estimate of potential scale reduction shown as a func- 
tion of the number of Markov chain Monte Carlo steps. The 
upper bound we require for convergence is shown as a hori- 
zontal line. 



leading to predictions of S(g — 2)^/2 — 1.8 x 10 9 , 
BR(b -> s-y) = 3.63 x 10~ 4 and fl DM h 2 = 0.1124 and 
corresponding to a combined likelihood (ignoring normal- 
isation constants, as stated above) of £ = 0.95. The point 
is within the A — pole region, and the centrality of pre- 
dicted observables gives us confidence that mSUGRA can 
fit well to current data. The efficiency of the MCMC al- 
gorithm is quite small: only 4.1%. This reflects the fact 
that the thickness of WMAP-allowed volume is small, 
making it difficult to sample efficiently. 

We display binned sampled likelihood distributions in 
Figs.[2M2F for the full m tl m b (m b ), a s (M z ) MS , m , A , 
tan/3 and M 1 / 2 parameter space. The unseen dimensions 
in each figure have been marginalised with flat priors. All 
2-d or 1-d marginalisations in this paper assume a flat 
prior (within the ranges of parameters considered in Ta- 
ble HJ in all unseen dimensions (or in other words, the 
likelihood is integrated over them with equal weight). 
One can view the marginalisation probabilistically, or 
just as a means of viewing the higher dimensional pa- 
rameter space. We have used 75x75 bins, normalising 
the likelihood in each bin to the maximum likelihood in 
any bin in each 2d plane. 

In each plot, the h pole s-channel resonant annihila- 
tion region is present close to the lowest values of Mij 2 . 
It can be seen as a vertical sliver in the top-left hand 
corner of the mo — M-^ji as in Fig. [2^ and the slim band 
ranging across the bottom of Figs. [2J1J2F. The bright 
region in Fig. ®l at low values of mo is primarily a co- 
annihilation region where slepton-neutralino annihilation 
contributes significantly to the depletion of the neutralino 
relic density in the early universe. However, at the low- 
est values of mo and Mi/ 2 values visible on the graph, 
the bulk region resides, being continuously connected to 
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FIG. 2: Likelihood maps of mSUGRA parameter space. The graphs show the likelihood distributions sampled from 7d 
parameter space and marginalised down to two. The likelihood (relative to the likelihood in the highest bin) is displayed by 
reference to the bar on the right hand side of each plot. The contours show the 68% and 95% confidence level limits. 



the co-annihilation tail, as shown in Ref. 7] . The pseu- 
doscalar Higgs (A ) s-channel annihilation channel oc- 
curs at high tan (3 — 50 — 60 and in the intermediate 
areas of m = 500 - 1600 GeV, M 1/2 = 250 - 1400 
GeV. In the literature, the most common way to dis- 
play mSUGRA results is to present them in 2d in the 
rriQ-Mi/2 plane, where thin strips are observed (see for 
example Ref. [11() that are consistent with the WMAP 



constraint upon fluMh 2 . Fig. [2^ demonstrates (in cor- 
roboration with Refs. [HI Hi) that the strips are truly a 
result of picking a 2d hyper-surface in parameter space: 
if one performs a full multi-dimensional scan, there is a 
large region in the rriQ-Mi / 2 plane that is consistent with 
the data. The bottom right hand side corner of Fig. [2^ 
is ruled out primarily by the fact that the lightest su- 
persymmetric particle (LSP) is charged. Large My 2 is 
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disfavoured by the (g— 2) M result. The bottom left-hand 
corner of Fig. is ruled out by a combination of dark 
matter and direct search constraints. We see an inter- 
esting correlation between mo and tan/3 in Fig. [2b: the 
region extending to low tan (5 and low mo is essentially 
the stau co- annihilation/bulk region. 

We display some binned likelihood distributions of 
MSSM particle masses in Figs. [3ji-|3j2 that are relevant 
for dark matter annihilation. In Fig. [3^, the A°-pole res- 
onance region is clearly discernable: at just above a line 
2m x o = M40 and just below it, there is just enough anni- 
hilation to produce the observed relic density. Through- 
out much of the parameter space (Ma < 1 TeV) , the ex- 
act resonance condition depletes the relic density ^loAth 2 
to be too small. At around Ma ~ 1 TeV, exact reso- 
nance is required in order to sufficiently deplete the relic 
dark matter density. The rest of the likelihood density is 
spread over the dark part of the plot and is mostly too 
diffuse to be visible. In Fig. [3p, there are two detectable 
regions: the small one with the maximum 2d binned like- 
lihood at the lowest possible values of m x o corresponds 
to the /i°-pole region. The larger upper high likelihood 
region is an amalgam of the co-annihilation and A°-po\e 
regions. As a by-product we see that values of rrih > 126 
GeV are disfavoured in the mSUGRA model. In Fig. [3b, 
the stau co-annihilation region is visible as the diago- 
nal dark line and the h°-po\e region as the lower like- 
lihood horizontal dark line. The rest of the likelihood 
density is diffusely distributed in between these two ex- 
tremes. We have shown the likelihood distribution for 
lightest chargino and slepton masses in Fig. [3bl. Unfor- 
tunately, most of the likelihood is in a region where the 
well-known tri-lepton search channel [651 ] at the Tevatron 
is rather difficult. With 8 fb _1 of integrated luminosity, 
this search channel requires m ± < mr. m ± < 250 for a 
discovery [66|. 

We show the sampled likelihood distribution for mj , 
rrig, and mq L in Fig. The likelihood distribu- 
tions have been placed into 75 bins of widths that are 
equal for the different types of sparticles. They are each 
normalised to an integrated likelihood of 1. The spike 
at low values of the gluino mass corresponds to the h°- 
pole region of mSUGRA. This spike has a good chance 
of being covered at the Tevatron experiments before the 
LHC starts running (66|. It should be noted that up- 
per limits upon the scalar sparticle masses inferred from 
Fig. UK are due largely to the definition of the range of the 
initial parameters (mo being less than 2 TeV) . Neverthe- 
less, it is clear that there is already some preference from 
the combined data for various ranges of sparticle masses 
and upper bounds upon the gaugino masses. For exam- 
ple, values of m g > 3.5 TeV and m s = 400 - 800 GeV 
are disfavoured, as well as rriq L < 800 GeV. In Fig. 2b, 
the mass splitting between the lightest stau and light- 
est neutralino is shown. The insert in the figure displays 
the quasi-degenerate co-annihilation region at low mass 
splittings. The peaked region at m^ — m x 1 GeV is 
likely to be difficult to discern at the Large Hadron Col- 



lider (LHC). One would wish to measure decays of the 
lightest staus in order to check the co-annihilation re- 
gion, but reconstructing a relevant soft r resulting from 
such a decay is likely to prove problematic. On the 
other hand, a linear collider with sufficient energy to 
produce sparticles could provide the necessary informa- 
tion [29|, [63] . The predicted likelihood distribution of the 
B s — » branching ratio is shown in Fig. 2b. Possi- 

ble values for this observable were found with a random 
scan of unconstrained MSSM parameter space in Ref. [68[ 
(no likelihood distribution was given) . The region to the 
right hand side of the vertical line is ruled out from the 
combined CDF/DO 95%C.L. exclusion [6§, [z3 [13| limit 
BR(B° S -> < 3.4 x 10~ 7 . We have not cut points 

violating this constraint, but this has negligible effect 
since there are only a small number of them. The 2d 
marginalisation of the branching ratio versus Mi/ 2 shows 
a peak at very low M 1 / 2 values, as Fig.0bl displays. This 
indicates that the spike in Fig. 2b at branching ratios of 
about 10~ 8,6 is due to the h°-po\e region. Lowering the 
empirical upper bound on the B s — > branching 
ratio will significantly cut into the allowed mSUGRA pa- 
rameter space. A significant lowering of the bound upon 
this branching ratio is expected in the coming years from 
the Tevatron experiments and from LHCb. For exam- 
ple, it is estimated (6(| that the Tevatron could exclude 
a branching ratio of more than 2 x 10~ 8 with 8 fb _1 
of integrated luminosity. This corresponds to ruling out 
29% of the currently allowed likelihood density. Predic- 
tions for BR(B S — » n + (x~) were correlated with those for 
(g - 2) M in Ref. [72] in mSUGRA. For a given mSUGRA 
parameter point, a correlation was shown when tan /3 was 
varied. The authors conclude that for high tan/3, an en- 
hancement of BR(B S — ► /i + /i~) is implied by the (<? — 2) M 
measurements. We re-examine this statement in view of 
the full dimensionality of the mSUGRA parameter space 
in Fig. 2b- The correlation is seen to be far from per- 
fect, the likelihood distribution being highly smeared in 
terms of the two measurements. Nevertheless, there is 
a mild correlation between BR(B S — > /i + /i~) and the 
SUSY contribution to (g — 2) M at the highest likelihoods, 
as evidenced by the bright oblique stripe in Fig. 0b- In 
Fig. Hf, we show the likelihood distribution in the tan /3- 
rriA plane. There is a significant amount of likelihood 
density towards the top left-hand side of the plot, where 
the Tevatron is expected to have sensitivity [66j (cover- 
ing tan/3 > 40 for tua < 240 GeV for 8fb _1 of integrated 
luminosity) . LHC experiments should be able to observe 
the A for the entire high likelihood range 12]. 

We now ask what are the likelihoods of the different 
regions of relic density depletion in mSUGRA parameter 
space. In order to sharply delineate the regions, we de- 
fine them as follows: the co-annihilation region is defined 
such that m x o is within 10% of mf x . The h /A°-pole re- 
gions have 2m x o within 10% of m^o, jtiao respectively. 
Stop co-annihilation requires a broader definition: it is 
defined such that m x o is within 30% of m^^since the an- 
nihilation is so much more efficient [HI, [24|, [25| than in the 
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FIG. 3: Likelihood distributions of masses in mSUGRA. The graphs show the likelihood distributions marginalised down to 
2d. The likelihood (relative to the likelihood in the highest bin) is displayed by reference to the bar on the right hand side of 
each plot. The contours show the 68% and 95% confidence level limits. 



TABLE IV: Likelihood of being in a certain region of 
mSUGRA parameter space. 



Region 


likelihood 


h° pole 

A pole 
f co-annihilation 
t co-annihilation 


0.02±0.01 
0.41±0.03 
0.27±0.04 
(2.1 ±4.8) x 



other regions. A better defined procedure might perhaps 
be to determine regions on the basis of the dominant an- 
nihilation mechanism, but since we are only looking for 
a rough indication of the region involved, the procedure 
adopted here will suffice. Points that fall in between any 
of the sharp definitions are either from the bulk region 
or in the smaller tails of the likelihood distribution. 

The likelihoods of these regions are shown in Table [TV] 
We estimate the uncertainty by calculating the standard 
deviation on the 9 independent Markov chain samples. 
The quoted error thus reflects an uncertainty due not to 
experimental errors, but to a to finite simulation time of 



the Markov chain. We see that the h -pole region has a 
relatively low likelihood whereas for the A°-po\e and r 
co-annihilation regions the likelihood is larger. From the 
table, we see that the i-co-annihilation region, although 
uncertain due to the low statistics, is negligible, and we 
now investigate why this is the case. 

The suppression of the stop-co-annihilation region 
comes from essentially two effects: firstly, as already 
apparent from Ref. [25| . finding a suitable stop co- 
annihilation region which is compatible with both the 
(g — 2) M and BR[b — > sj] measurements is problematic. 
Secondly, the central value of m t has come down since 
ref. [25l | . The dominant radiative corrections to m h o are 
highly correlated with m t [5l| , with the consequence that 
the lower predicted Higgs mass now rules out more of the 
stop co-annihilation region. We illustrate these points in 
Fig. [5] along the too direction for given values of the other 
mSUGRA parameters (stated in the caption) . In Fig. [5k., 
we plot the fractional stop-neutralino mass splitting A 
alongside the neutralino relic density VtuMh 2 . We see 
that the fractional mass splitting takes values between 
0.1 and 0.23 in the range of mo shown. This is the stop- 
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FIG. 4: (a) Selected sparticle mass likelihood distributions in mSUGRA, (b) stau-neutralino mass difference likelihood distri- 
bution where the insert shows a blow-up of the quasi mass-degenerate region, (c) branching ratio for the decay B s — > n + [i~ . 
The Tevatron upper bound is displayed by a vertical line, (d) Likelihood density marginalised to the 2d plane BR(B S — > fi~) 
versus My^. (e) Correlation between BR(B S — » /i + fi~) and (g — 2) M . (f) Likelihood marginalised to the tan (5- Ma plane. The 
contours show the 68% and 95% confidence level limits. 



co-annihilation regime, and we see that around mo ~ 786 
GeV, A ~ 0.2 corresponding to iljjMh 2 roughly compat- 
ible with the WMAP constraint. Unfortunately, we also 
see that the lightest CP-even Higgs mass is predicted to 
be 110.3 GeV here and is ruled out by the LEP2 Higgs 
limits shown in Table HTT1 (sin 2 (/3 — a) — 1.0) for this 
range of parameters). This problem is remedied by going 
to higher values of mt, since m^o then goes up, but of 



course this comes with an associated penalty in the likeli- 
hood from being away from the empirically central values 
of m t . In Fig. [5)3, we display predictions for BR[b — > sj] 
and 8{g — 2)^/2 along the chosen range for m . These 
predictions are both lower than the empirically derived 
constraints in the region where the dark matter relic den- 
sity is in accordance with the WMAP constraint: includ- 
ing errors in the theoretical prediction as described in 
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FIG. 5: A line through the stop co-annihilation region in mSUGRA: tan/3 = 10, Mj/ 2 = 350 GeV, A = -2000 GeV and 
central Standard Model inputs, (a) dark matter relic density, fractional stop-neutralino mass splitting A = (m^ — m x o)/m x o 

and light Higgs mass. The horizontal lines show the la WMAP-derived limits upon Q,Dhih 2 derived from Eq. [4] (b) muon 
magnetic moment and BR[b — > sy] predictions. 



the previous section, BR[b — > sj] is 1.95<7 lower than the 
central value and S(g — 2) M /2 is l.Qa lower. It turns out 
that these predictions are not very sensitive to changes in 
rrit and so their likelihood penalty tends to apply for the 
higher values. However, lower values of (— Aq) require 
lower mo in order to fit the dark matter constraint and 
pick up a bigger likelihood constraint from the egregious 
prediction of BR[b — > sj]. These findings are in rough 
agreement with those of Ref. [25[ except for the more re- 
strictive Higgs bounds, which are a consequence of the 
lower experimental value of m t . Ref. [25| only applies 
2(7 bounds on both (g — 2) M and BR[b — > sj], whereas 
our results take into account the likelihood penalty paid 
by the fact that neither prediction is close to its cen- 
tral value near the stop co-annihilation region, which 
then becomes disfavoured compared to the other regions 
(where an almost perfect fit is possible). There is also 
a volume effect: in Table IIV1 the likelihoods we calcu- 
late are integrated over the relevant region. Thus regions 
that are very small, such as the stop co-annihilation re- 
gion, will tend to have a smaller likelihood than other, 
larger regions. In analyses in following sections, stop co- 
annihilation also turns out to have negligible likelihood 
and so we will neglect it from the results. 

As an aside, we note that the decay chain — > x% * 
Ir — ► xl exists with a likelihood of 0.24±0.04 (this num- 
ber is just based upon the mass ordering and does not 
take into account the branching ratio for the chain). The 
existence of such a chain allows the extraction of several 
functions of sparticle masses from kinematic end-points 
and they have been used in many LHC analyses, for ex- 
ample Refs. [H,[zi,[zi- 



TABLE V: Likelihood of being in a certain region of 
mSUGRA parameter space including theoretical uncertain- 
ties in the sparticle spectrum calculation. 



Region 


likelihood 


h° pole 
A pole 
co- annihilation 


0.03±0.01 
0.41±0.05 
0.26±0.08 



IV. THEORETICAL UNCERTAINTY 

Theoretical uncertainties in the sparticle mass pre- 
dictions have been shown to produce non-negligible ef- 
fects in fits to data (7f|, including fits to the relic den- 
sity [ill [3(| [Zfl- I n this section, we perform a second 
MCMC analysis taking theoretical uncertainty into ac- 
count in order to estimate the size of its effect. SDFTSUSY 
performs the Higgs potential minimisation then calcu- 
lates sparticle pole masses at a scale Msusy = ^ m t 1 m t 2 ■ 
This scale is chosen because it is hoped that loop correc- 
tions to the pole mass corrections that are not yet taken 
into account (typically two-loop corrections) are small at 
this scale. In order to estimate the size of theoretical 
uncertainties, we vary this scale by a factor of two in ei- 
ther direction (but it is always constrained to be greater 
than Mz)[13|. Implementation of the uncertainty in the 
MCMC algorithm is simple: we simply add an input pa- 
rameter x which is bounded between 0.5 and 2, giving 
the factor by which Msusy is to be multiplied. The 
MCMC is then re-run as before and explores the full 8d 
parameter space (including x) accordingly. 

Such a procedure automatically takes into account the 



12 



correlations in predictions due to correlated theoretical 
uncertainties in the sparticle mass predictions. The like- 
lihoods of the three mSUGRA regions are shown in Ta- 
ble [V] The likelihoods are approximately equal to those 
for the 7d case, as a comparison to Table HVl indicates. 
In fact, comparing results and plots produced with and 
without theoretical uncertainties, the results are gener- 
ally very similar. This indicates that the theoretical un- 
certainties don't make a huge difference to the Id and 
2d marginalisations compared to those coming from the 



data. The decay chain q L — > X2 ~~ * 'fi — * Xi nas a live- 



lihood of 0.22 ± 0.08, not significantly different to the 
case when theoretical uncertainties were not taken into 
account. 

We show two of the 2d marginalised likelihoods in 
Figs. [H^EDs. Comparing with Fig. [2J we see that Fig. [S] 
shows no significant effects deriving from theoretical er- 
rors. Marginalising mass likelihood distributions down 
to Id (as in Fig. for example), one obtains distribu- 
tions that are identical to the 7d case except for small 
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statistical fluctuations. 



V. OTHER SOURCES OF DARK MATTER 

For completeness, we may ask how robust our results 

are with respect to the assumption that non thermal- 

neutralino contributions to firjM & re negligible. Thus, 

we allow the predicted amount of thermal neutralino relic 

density Pn DM h 2 (assuming some mSUGRA point s) to 

be some fraction of the total relic predicted relic density 
Qtot h 2. 
DM' ■ 



Pn DM h 2 

Qtot U2 ■ 



< A < 1. 



(24) 



Thus, the total amount of dark matter predicted is 
Pn DM h 2 1 X and, assuming a flat pdf for A as shown in 
Fig-EK we obtain the likelihood penalty for a given SUSY 
dark matter prediction 



exp 



2sn DMh 2(Xy 



(25) 



where 

SfW^(A) = 0.0Q81 Q(pn DM h*/^-mQ DMh 2) + 

0.0091 9(-p QDuh 2/X + m QDMh 2) (26) 

in accordance with the asymmetric errors in Eq. 2J 9{x) 
is the Heavisde step function, 6{x) — 1 for all x > 0, 
8(x) — for all x < 0. We calculate Eq. [25] numer- 
ically and display it in Fig. [7t>, where it is contrasted 
with the old dark matter likelihood penalty that assumes 
that all dark matter is of thermal neutralino origin. The 
figure shows that if the relic density is too high, a se- 
vere likelihood penalty applies (similar to the "no extra 
DM allowed" case) but a much less severe penalty ap- 
plies if the prediction is below the central value of the 
observed WMAP value in Eq. 2J The additional contri- 
bution to flh 2 is assumed to be provided by some non 
thermal-neutralino source (late decays or hidden sector 
dark matter for example) [39|]. The rest of the analysis 
proceeds exactly as in section ILTT1 (i.e. without simultane- 
ously taking theoretical uncertainty into account). 

The MCMC algorithm turns out to be much more ef- 
ficient once we drop the assumption that the cold dark 
matter consist only of neutralinos: 19.9% efficiency was 
achieved compared to 4.1%. One consequence of this 
is that statistical fluctuations in the results are smaller. 
The likelihoods in each region are shown in Table IVII 
A comparison of Tables IIVIVII shows that annihilation 
through A pole has acquired a significantly larger like- 
lihood through allowing for other forms of dark mat- 



TABLE VI: Likelihood of being in a certain region of 
mSUGRA parameter space including possible an additional 
contribution from non thermal-neutralino dark matter. 



Region 


likelihood 


h° pole 
A pole 
co- annihilation 


0.04±0.01 
0.52±0.02 
0.14±0.02 



ter. The higgs-pole and co- annihilation regions are still, 
within statistics, compatible with their previous likeli- 
hoods. All of the listed uncertainties have decreased, 
due to the additional efficiency. 

Fig. [S^i shows the likelihood map marginalised to the 
Mx/2 — mo plane. Comparing it to Fig. [2^, we see a very 
similar picture except for the fact that the higher volume 
of likelihood in the A -pole region is evident. The same 
comment can be made of all of the plots analogous to the 
ones in to Fig. [2J we show the likelihood marginalised 
to the mo — Aq plane in Fig. [5] as an example. There 
are no other qualitative changes in any of the plots, and 
indeed the likelihood distributions marginalised to the 
A Q — tan/3 and M 1 / 2 — A planes (analogous to Figs.[5b,f) 
are identical by eye, except for being smoother due to 
the increased efficiency. Likelihoods of sparticle masses 
also look the same, except for the spike in the gluino 
mass, which has twice as much likelihood. As mentioned 
before, this spike is due mainly to the light h°-pole re- 
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FIG. 6: Likelihood maps of mSUGRA parameter space including theoretical uncertainty. The graphs show the likelihood 
distributions sampled from 8d parameter space and marginalised down to two. The likelihood (relative to the likelihood in 
the highest bin) is displayed by reference to the bar on the right hand side of each plot. The contours show the 68% and 95% 
confidence level limits. 
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gion which is subject to relatively large fluctuations, as 
Tables IIVIVII illustrate. We cannot conclude that the 
/i°-pole region obtains more integrated likelihood by ad- 
mitting non thermal-neutralino components to the relic 
density because the statistics in the MCMC algorithm 
are not high enough. 

One distribution that does significantly change shape 
is that of BR(B S — > /j + /i~). We show its marginalised 
distribution after dropping the lower likelihood penal- 
ties on VloMh 2 from the MCMC algorithm procedure 



in Fig. [HI as the histogram marked "extra DM allowed" . 
For the purpose of comparison, the default calculation 
where we assume all dark matter to be thermal neutrali- 
nos from Fig.UJ: is also displayed, being marked "no extra 
DM allowed" . Comparing the two distributions, we see a 
broader distribution due to the enhanced ^4°-pole annihi- 
lation region when additional components are allowed in 
the fit. The >l -pole region has higher tan/?, and there- 
fore higher values for the branching ratio. The estimated 
amount of likelihood that could be covered by Tevatron 
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FIG. 8: Likelihood maps of mSUGRA parameter space allowing for non thermal-neutralino contributions to the dark matter 
relic density. The graphs show the likelihood distributions sampled from 7d parameter space and marginalised down to two. 
The likelihood (relative to the likelihood in the highest bin) is displayed by reference to the bar on the right hand side of each 
plot. The contours show the 68% and 95% confidence level limits. 
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FIG. 9: Investigation on the effects of allowing for a non 
thermal-neutralino component of dark matter in the branch- 
ing ratios for the decay B a — > fiT in mSUGRA. The current 
Tevatron upper limit is displayed by a vertical line. 



measurements with 8 fb" 1 of integrated luminosity in- 
creases by 6% to 35% of the currently allowed density 
due to the presence of non thermal-neutralino dark mat- 
ter contributions. 



VI. CONCLUSIONS 

Previous studies of mSUGRA in the context of dark 
matter and particle physics constraints have tended to 



not use the full dimensionality of the parameter space, 
and to have put hard 95% (C.L.) limits on predictions. 
Here, in the full dimensionality of parameter space, we 
include all of the information in a likelihood fit, so that 
violating one constraint slightly might be traded against 
fitting another one better in a consistent manner. Al- 
though there is plenty of qualitative information about 
possible dark matter annihilation regions in mSUGRA in 
the literature, this paper gives a quantitative calculation 
of the likelihood distributions in the full dimensionality 
of the parameter space. However, the most important 
contribution of this paper lies in the implications of the 
results to particle physics. 

We have successfully employed an MCMC algorithm 
to provide likelihood maps of the full 7d input param- 
eter space of mSUGRA. By using a statistical test, 
we have shown that the likelihood distributions have 
achieved good convergence before a total of 9xl0 6 sam- 
plings of the likelihood. We have presented the likelihood 
marginalised down to each 2d mSUGRA parameter pair. 
Such plots provide the totality of the current informa- 
tion we have about the model given the experimental 
constraints and are quantitative results. Theoretical un- 
certainties in the sparticle spectrum calculation broaden 
a couple of the distributions a little but do not change 
them radically. 

The main new contribution of this paper is to our 
knowledge of what current constraints on mSUGRA 
mean for particle physics in a quantitative sense. 
Marginalised Id likelihood distributions of quantities 
such as sparticle masses (or mass differences) already 
show some significant structure from the data, providing 
interesting information for future collider searches. In 
particular, the likelihood of the "golden cascade" qL — > 
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X°> — ¥ Xi being kinematically allowed is 24±4%. 

m fl —m x o is peaked below 10 GeV, implying that stau re- 
construction at hadron colliders could be problematic at 
the LHC. Integrating the likelihood density, we find the 
likelihood of m?^ — m x o < 10 GeV is 19.9%. Our likeli- 
hood distributions for BR(B S — > /i + /i _ ) corroborate the 
conclusions of ref. [41]: that current bounds upon the 
branching ratio do not yet place significant constraints 
upon mSUGRA once other constraints have been taken 
into account, but any improvement on the upper bounds 
constrain the currently available parameter space. The 
quantitative results on BR(B S — > fi + (i~) are particu- 
larly important: if the Tevatron experiments can reach 
down to 2xl0~ 8 , they will cover 29% of the mSUGRA 
likelihood, or 35% if we allow the possibility of additional 
contributions to the relic density other than thermal neu- 
tralinos. 

We have shown that the correlation between BR(B S — > 
^ + /i~) and S(g — 2) M noticed in Ref. [72| is much diluted 
once simultaneous variations of all mSUGRA parameters 
are taken into account. The A°-pole annihilation region 
and the stau co-annihilation region each have approxi- 
mately an order of magnitude more likelihood than the 
fr°-pole region. Stop co-annihilation is highly disfavoured 
compared to these other regions due to more restrictive 
Higgs mass constraints coming from a lower value of m% , 
as well as the BR(B S — > fi + fi~) and 6(g — 2) M predic- 
tions. The light ft, — pole region just survives the LEP2 
Higgs mass constraints despite the new reduced top mass 
value albeit with a reduced likelihood (in the usual fre- 
quentist language, it's outside the 95% confidence level 
but not the 99% one) . The light h°-pole region has more 
likelihood if one allows additional non thermal-neutralino 
components of dark matter. 

The analysis of Ref. [ll[ includes My/ and sin 2 9 l e ^ 
in the x 2 statistic, excluding M 1/2 > 1500(600) GeV 
at the 90% confidence level for m t = 178 GeV and 
tan/3 = 50(10) respectively. These numbers are not 
exactly reproduced in the present analysis for several 
reasons: we use a more up-to-date value of m t , we 
vary mt, mt,, a a [Mz) and tan/3 simultaneously with the 
other mSUGRA parameters and we do not include My/, 
sin 2 l e ff in the fit. Indeed, one may wonder about in- 
troducing a posteriori bias as a result of only picking 
these two precision electroweak observables, since they 
are the two that show a preference for a SUSY contribu- 
tion. Other observables would presumably prefer heavier 
SUSY particles. However, My/ and sin 2 9 l e ^ do show a 
preference for lower M 1 / 2 for m t = 178 GeV. Having said 
that, our results are not wildly different, as an examina- 
tion of Fig. [2^ shows. 

We suspect that the MCMC techniques exemplified 
here could be found extremely useful in SUSY fitting pro- 
grams such as FITTIN0 [z3] and SFITTER [78] in order to 
provide a likelihood profile of the parameter space, in- 
cluding secondary local minima. These programs are de- 
signed to fit more general MSSM models than mSUGRA 
to data, with an associated increase in the number of 



free input parameters, so the linear calculating time of 
MCMCs ought to be very useful. 

While our results presented for mSUGRA are in them- 
selves interesting, it is obvious that the method will be 
applicable in a much wider range of circumstances. Once 
new observables become relevant, such as some LHC end- 
points, for example, it would be trivial to include them 
into the likelihood and re-perform the MCMC [79[ . The 
method should be equally applicable to other models, and 
provided enough CPU power is to hand, could provide 
likelihood maps for models with even more parameters 
and/or detailed electroweak fits. 
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APPENDIX A: TOY MODELS AND SAMPLING 

By definition, a sampler able to sample correctly from 
a pdf p(x) must generate a list of x values whose lo- 
cal density, at large step-numbers, is proportional to the 
probability density p(x) at each part of the space (in the 
preceding parts of the paper we have set this pdf to be the 
likelihood) . The value of the constant of proportionality 
between the probability density and the local density of 
x values is unimportant, but the key point is that it is 
constant across the whole space. 

It can sometimes be hard to implement a good sam- 
pler for a given probability distribution. In fact it is often 
easier to invent an algorithm which generates a sequence 
of x values, and which may superficially resemble a sam- 
pler, but which lacks constancy of proportionality over the 
space. We might call such algorithms pseudo-samplers as 
their output can sometimes resemble that of a true sam- 
pler, provided that the variation in proportionality is not 
too great across the space considered. Pseudo-samplers 
are sometimes useful (e.g. as a means of exploring a mul- 
tidimensional space in which case sample density may be 
neither interesting nor the end product of the analysis). 

In the present and in many other papers, however, 
sample density represents confidence in some particular 
part of parameter space, and is the final product of the 
analysis. Extreme care, then, must be taken to ensure 
that any creative modifications to established Markov 
Chain sampling techniques do not break the principle 
of detailed balance - the test which ensures that the al- 
gorithm remains a true sampler rather than a pseudo- 
sampler. 

It is often desirable for Metropolis-Hastings type sam- 
plers to have an efficiency of about 25% for the accep- 
tance of newly proposed points. Efficiencies much smaller 
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than this may suggest that the proposal distribution is 
too wide and is too often proposing jumps to undesir- 
able locations far away from the present point, leading 
to large statistical fluctuations in the result. Efficiencies 
much larger than this can be indicative of proposal dis- 
tributions which are too narrow and may take too many 
steps to be practical to random-walk from one side of a 
region of high probability to the other. It is tempting, 
therefore, to adapt the present step size (i.e. proposal dis- 
tribution width) on the basis of recent efficiency. With a 
couple of toy examples, however, we will illustrate that 
this is a dangerous path to follow, and we will demon- 
strate that it break the principle of detailed balance, and 
thus turns the Markov Chain algorithm from a sampler 
into a pseudo-sampler. 

We take as our example the method used by Baltz 
and Gondolo [44( which is designed to keep the target 
efficiency of the authors' Markov Chain close to 25%: 

1. Double the current step size if the last three pro- 
posed points were all accepted 

2. Halve the current step size of the last seven pro- 
posed points were all rejected. 

We will refer the the above method as "the adaptive al- 
gorithm" and show that it fails to sample correctly from 
some simple distributions, a signature of detailed balance 
being broken. 

Take, for example, the 1-dimensional double Gaussian 
probability distribution pdGauix) defined by 



PdGau(x) oc g(x, 0, 1) + g(x, 10, 1/2), 



(Al) 



where g(x,m,a) — exp(— (x — m) 2 /(2a 2 )) is a Gaus- 
sian distribution of unit height and width a centred on 
x = to. Note that the Gaussian near the origin is wider 
than the Gaussian near x — 10. This pdf mocks up 
the approximate situation along the Mi/ 2 direction in 
mSUGRA for large too, as Fig. [5^ shows. The narrow 
Gaussian would then correspond to the light /i°-pole re- 
gion, which is quite disconnected from the wider A°-po\e 
region. A Metropolis-Hastings algorithm (see section HI]) 
with a fixed Gaussian proposal pdf Q(x) of width 5 was 
run for 2 000 000 steps and the binned result is shown 
in Figure 110a . It reproduces the target distribution very 
well. In contrast, Figure [T0b shows what happens when 
the adaptive algorithm is run onpdGauix). Clearly the re- 
sult is very different to that in Figure fTUa and is thus very 
wrong. The narrow Gaussian has been sampled many 
times more frequently than it should have been relative 
to the wider one near the origin. 

The adaptive algorithm ensures that whenever the cur- 
rent point is in one of the two Gaussian regions, the step 
size is adjusted to be proportional to the width of that 
region. This adaptation is not immediate (seven succes- 
sive rejections must occur before the step size is halved) 



but suppose for the moment that adaptation were to take 
place almost immediately For the moment, let us also 
neglect random fluctuations of the step size. In such a 
limit, when the current point is in the left-hand Gaussian 
region, the step size is double what it is when the cur- 
rent point is in the right-hand region. Making a proposal 
for a jump from the region on the left to the region on 
the right, therefore, is something like a 10-sigma event. 
In contrast, making the reverse proposal (from right to 
left) is more like a 20-sigma event. In this limit, it is 
thus e^ 20 ) - ( 10 ) )/ 2 = e 150 times more likely that jumps 
to the right get proposed than jumps to the left. This 
is a vast overestimate of the bias toward the narrow re- 
gion for two reasons. Firstly, step size adaptation is not 
immediate (even when the step size is half of what it 
should be, quite a few steps occur before seven succes- 
sive rejections). Secondly and more importantly, even 
when settled in one of the two regions, the adaptive step 
size makes excursions about its mean value. Excursions 
to very high step sizes (double or quadruple the average 
step size) are infrequent but still occur. When they do, 
they elevate the chance of proposing a jump from one re- 
gion to another, and help to equilibrate between the two 
regions. Both of these effects reduce the bias favouring 
the narrow region, but still break detailed balance, and 
overall the right hand Gaussian is still sampled about 10 
times more frequently than it should be. 

For quite a different example, consider the truncated 
Cauchy distribution defined by 



Pcauchy(%*) ^ 



1/(1 + a; 2 ) if x > -500 and x < 50 







otherwise. 



^2) 



A faithful sampling from this distribution is shown in Fig- 
ure lllf a) , and a mis-sampling using the adaptive algo- 
rithm is shown for comparison in Figure fTlT b). Although 
there is better correspondence between the samples gen- 
erated by the two methods than was the case earlier, 
it is nevertheless evident that there are large differences 
between the degrees to which the two methods have sam- 
pled the tails of the distribution. As a consequence, sam- 
ples from the adaptive method have a root-mean-squared 
value of 7.6 compared to the value of 17.9 obtained by 
the true sampling. The cause of the discrepancy is again 
due to the very different scales in the cauchy distribu- 
tion. Its core is very narrow with a width of order 1, but 
there are significant parts of probability also lodged in 
the tails many orders of magnitude away. The adaptive 
method has to raise and lower the step size frequently 
to sample from the whole dynamic range, and in doing 
so it has to break detailed balance many times, resulting 
in the cumulative effect of an overall order of magnitude 
bias in the tails. 
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(a) 




(b) 
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FIG. 10: Binned samples of the double Gaussian distribution pdGau(x). The normalisation is arbitrary and has no relevance 
here, (a) uses a Metropolis-Hastings algorithm and yields a good approximation whereas (b) uses the adaptive algorithm. 
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FIG. 11: Binned sampling of a Cauchy distribution 1/(1 + x 2 ) (shown in solid red line) truncated at x — ±500. (a) shows the 
result of direct sampling and yields a good approximation whereas (b) uses the adaptive algorithm. 
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